Genomic clones coding for the Artemia franciscana
sarco(endo)plasmic reticulum Ca-ATPase have been isolated.
The restriction map of the overlapping clones covers a region
of 65 kilobases of DNA. The nucleotide sequence of the mRNA
coding regions shows that the gene is divided into 18 exons
separated by 17 introns. Compared with the structure of the
rabbit sarco(endo)plasmic reticulum Ca-ATPase 1 gene, 12 of
the introns are in the same position, 8 introns present in the
rabbit gene are absent from A. franciscana, 4 introns present
in A. franciscana are not found in rabbit, and the position of
1 intron is shifted by one base between the two genes. Southern
blot analysis strongly suggests that this is the only
sarco(endo)plasmic reticulum Ca-ATPase gene present in
A. franciscana. Primer extension and nuclease S1 protection
experiments have shown the existence of two main regions of
transcription initiation separated by 30 nucleotides.
Transcription is initiated in both regions at two or three
consecutive bases. A hexanucleotide that includes the
initiation sites is repeated in both transcription initiation
regions. The nucleotide sequence of the promoter region shows
the existence of several putative regulatory sites, including
some that are muscle-specific, such as one CArG box, 3 MEF-2
sites, and 8 putative binding sites for muscle transcription
factors of the MyoD family.


The sarco(endo)plasmic reticulum Ca-ATPases (SERCA) 1 are 
the enzymes responsible for the transport of calcium to the 
sarcoplasmic or endoplasmic reticulum after muscle contraction
or cell activation. Three different genes have been described
in vertebrates that code for highly homologous SERCAs. They are
expressed with different tissue specificity and have been called
SERCA1, -2, and -3, respectively (1). SERCA1 is expressed in
fast twitch muscles (2), SERCA2 in slow twitch, smooth, and
cardiac muscles and in non-muscular tissues (3), and SERCA3 in
some muscular and non-muscular tissues (1). Besides the
existence of these three genes, further heterogeneity of the
enzyme isoforms is obtained by differential processing of the
gene transcripts. The SERCA1 primary transcript retains its
penultimate exon in adult muscles but splices it out in neonatal
muscles. The two mRNAs originated in this way code for proteins
that differ exclusively in their C-terminal amino acids, -Gly in
the adult isoform and -Asp-Pro-Glu-Asp-Glu-Arg-Arg-Lys in the
neonatal one (4). The vertebrate SERCA2 primary transcript can
also be processed in different ways, depending on the use of two
alternative polyadenylation sites and two internal donor
splicing sites, so that up to four different mRNAs can be
synthesized (5, 6). The four mRNAs code for only two different
protein isoforms that also differ at their C-terminal region;
the last 4 amino acids of the SERCA2a isoform are replaced by 49
amino acids in the SERCA2b isoform (7-9). Alternative processing
of the SERCA2 gene is tissue-specific; the mRNA coding for
SERCA2a is expressed in slow twitch and cardiac muscles, while
the mRNAs coding for SERCA2b are expressed in smooth muscles and
non-muscular tissues (3, 5, 10).

SERCA genes have been much less studied in invertebrates,
but genes highly homologous to vertebrate SERCA genes have
been described in Drosophila (11) and the crustacean Artemia
(12). Drosophila has been shown to have only one SERCA gene
(13). In Artemia, two different SERCA mRNAs originate from
alternative processing of a single gene transcript. The two
mRNAs code for protein isoforms that are homologous to
vertebrate SERCA2a and -2b, differing only in their C-terminal
region. The alternative splicing is also similar to that
described for the vertebrate SERCA2 gene (14).

The results obtained in the study of vertebrate and arthropod
SERCA genes have shown the similarity of some of their
characteristics, in particular the high homology of their amino
acid sequences (greater than 70%) and the conservation of
similar alternative processing events. There are also some
differences, such as the existence of three genes in
vertebrates, which seem to have duplicated after the
protostome/deuterostome divergence. These differences might
have implications for the function and the regulation of the
expression of these genes, which makes further study of
invertebrate SERCA genes of interest. Furthermore, our
laboratory is interested in the mechanisms that regulate gene
expression during the embryonic development of Artemia, and we
have focused on this gene because of its high level of
expression in muscle cells.

In this article, we present the isolation of genomic clones
coding for the complete sequence of this gene and its
intron/exon structure, together with evidence suggesting that
there is only one SERCA gene in this organism. The
transcription initiation site and the nucleotide sequence of
the promoter region have been determined. The analysis of this
sequence has shown the existence of several putative regulatory
sites, some of which are consensus binding sites for
muscle-specific transcription factors.

MATERIALS AND METHODS 

Isolation of Genomic Clones — A total of 750,000 independent clones 
from a nonamplified A. franciscana genomic library (14) were screened
using the SERCA cDNA clones pArATCa151 and pArATCa412 (12) as
probes. Hybridization of the filters was carried out in 6 x SSC, 50%
formamide, 1% SDS, 5 x Denhardt's, 100 µg/ml calf thymus DNA, and
10^6 cpm/ml of the probe for 15 h at 42 °C. Filters were washed three
times for 30 min each in 2 x SSC, 1% SDS at 65 °C.

Southern Blot Analyses —Total DNA was purified from 20-h-old A. 
franciscana nauplii as described (15). Fifteen micrograms of the DNA 
were digested with the enzymes indicated in each case and analyzed on 
0.8% agarose gels. After electrophoresis, DNA was transferred to nylon 
membranes and hybridized to the cDNA probes. Two stringency
conditions were used in the hybridization and washing of the filters. Under
high stringency conditions, hybridization was made as described in the 
isolation of genomic clones, and the filters were washed two times for 30 
min in 2 x SSC, 0.1% SDS at 65 °C and once in 0.1 x SSC, 0.1% SDS at 
65 °C. When low stringency conditions were used, the hybridization 
temperature was 32 °C, and the filters were washed three times in 2 x 
SSC, 0.1% SDS at 65 °C. 

DNA Sequencing —The nucleotide sequence of the DNA was determined
by the dideoxy chain-termination method using the Taq Dye
Deoxy Terminator Cycle sequencing kit and the 373A Sequencer from
Applied Biosystems. In order to sequence gene exons, restriction
fragments from the genomic clones that hybridized with the cDNA were
cloned in the plasmid vector pUC18. The nucleotide sequence was
determined from these plasmids by the use of universal primers if exons
were located close to the end of the cloned restriction fragments or using
specific oligonucleotides complementary to the cDNA or genomic
sequences if they were not.

Primer Extension —Primer extension experiments were performed
according to Triezenberg (16). In summary, 75 µg of total RNA and
5 x 10^5 cpm of 32P-labeled primer oligonucleotide were used in each
experiment. Hybridization of the RNA with the primer was carried out
overnight at 30 °C, and the primer was extended with 50 units of avian
myeloblastosis virus reverse transcriptase for 90 min at 42 °C. The
extension products were analyzed on 6% polyacrylamide-7 M urea
sequencing gels. Sequencing reactions made from a DNA of known
sequence were used as markers to calculate the size of the extension
products.

Nuclease S1 Protection —The identification of the transcription
initiation site by S1 nuclease protection experiments was performed as
described (17). The most 5' restriction fragment from the genomic clone
gArATCa1 that contains the first exon of the gene (from the SalI site in
the EMBL-3 polylinker to the first EcoRI site, see Fig. 1) was cloned in
pUC18. The plasmid insert was labeled at the EcoRI site with T4
polynucleotide kinase and digested with SalI, and the labeled EcoRI-SalI
fragment was purified after electrophoresis in agarose gels. 4 x 10^4
cpm of the labeled probe was hybridized with 100 µg of total RNA
overnight at 40 °C. The hybridization mixture was diluted 10-fold in
S1 buffer and incubated for 1 h at 37 °C with the amounts of nuclease
S1 indicated in each case. The digestion products were analyzed on 6%
polyacrylamide-7 M urea sequencing gels. The labeled EcoRI-SalI
fragment was degraded at purine residues by the method of Maxam and
Gilbert (18) and used as a marker to identify the size of the protected
fragments.

RESULTS 

Exon /Intron Structure of A. franciscana SERCA Gene — 
Genomic clones coding for A. franciscana SERCA gene were 
isolated by screening 750,000 independent clones of a genomic 
library (14) with the cDNA clones pArATCa151 and
pArATCa412 (12). Low stringency hybridization conditions
were used to isolate clones coding for homologous proteins in
case there was more than one SERCA gene in A. franciscana.
Over 70 independent positive clones were obtained, and 16 of
them were selected for further analyses based on their
hybridization to different fragments of the cDNA clones. These
clones were plaque purified and their restriction maps
determined by single and double digestions with EcoRI, SalI, and
HindIII. The restriction maps of these phages showed that they
overlap and cover a region of 69 kb of DNA (Fig. 1).

Small differences between the restriction maps of overlapping
clones were observed, which was not unexpected given
that the genomic library was made from DNA obtained from a
large population of animals. Some of the differences observed
consist of restriction fragments of different size (connected by
vertical lines in Fig. 1). Some examples are the 2.7- or 3.2-kb
EcoRI-SalI fragment of clones 111, 29, 13, and 46; the 0.75- or
0.95-kb EcoRI-HindIII fragment of clones 50, 63, 23, and 56;
and the 3.6- or 3.9-kb HindIII-HindIII fragment of clones 64,
54, 25, and 57. These differences could be explained by the
existence of insertions or deletions in different groups of the
population. There are also some restriction sites that differ
between overlapping clones. For example, there are EcoRI and
SalI sites in clones 13 and 46 that are not present in the
overlapping clones 111 and 29. Another example is the EcoRI
and HindIII sites in the 5' half of clone 56 that are not present
in the same location in clones 29, 13, and 46. In all these cases,
the similarity of the fragments from different clones was
confirmed by hybridization or nucleotide sequencing. Except for
these small differences, all the genomic clones studied are so
similar that they must code for the same gene, suggesting that
no other SERCA gene exists in A. franciscana.

The next step in characterizing the gene was to determine its
exon/intron structure. The large size of the DNA fragment
covered by the genomic clones suggested the existence of large
introns, so we sequenced only the exon-containing regions
of the genomic clones. A number of oligonucleotides were
synthesized, according to the nucleotide sequence of the cDNA, to
localize and sequence the corresponding exons on the genomic
clones. The first sequences allowed us to determine the position
of some exons. This information was used to synthesize new
oligonucleotides corresponding to the 5' end of the next exons
that allowed their localization and sequencing. Antisense
oligonucleotides complementary to the introns, close to the 3' end
of each exon, were synthesized to sequence the antisense
strand of the DNA. The location of each exon in the corresponding
restriction fragment of the genomic clone was determined,
after cloning in pUC18, by sequencing both ends of the
fragment or by PCR using the specific oligonucleotides and the
pUC18 universal sequencing primers. These methods allowed
the localization of all the exons and the determination of their
nucleotide sequence. Exon contiguous regions, where splicing
donor and acceptor signals are located, were also sequenced.
The position of the sequenced exons is indicated in Fig. 1 as
open boxes on the restriction maps of the λ phages from which
they were sequenced. Some exons were sequenced in more than
one λ phage, especially if there were differences in the
restriction maps of overlapping clones, like phages 111, 29, and 13 or
13, 29, and 56.

A summary of the positions of all the exons and their sizes is
also shown under the restriction map of the genomic clones in
Fig. 1. The gene is divided into 18 exons; the smallest one is 101
bp long (exon 3) and the longest one 864 bp (exon 17), with an
average size of 230 bp. The size of exon 1 (91-122 bp) depends
on the use of two possible transcription initiation sites (see
below). The size of exon 17 (118-864 bp) is related to the use of
an internal donor splicing site. If this site is used, the exon is
118 bp long and is joined to exon 18 to code for one of the
isoforms of the enzyme. If it is not used, exon 17 is 864 bp long
and ends in a polyadenylation site so that exon 18 is not
included, and this mRNA codes for the other isoform of the
enzyme. Genomic clones gArATCa 2, 5, and 11, as well as this
optional splicing of exon 17, have been described previously
(14).

The exons are separated by 17 introns that range in size 
between 0.8 kb (intron 7) and 11.6 kb (intron 1) with an average 
size of 3.6 kb. Intron 1 is located in the 5' untranslated region 
of the gene while the other introns are situated in translated 
regions. 

Fig. 1. Restriction map of A. franciscana SERCA genomic clones and localization of their mRNA coding regions. The restriction map
of 16 genomic clones that hybridized to A. franciscana SERCA cDNA clones was determined by single and double digestions with the restriction
enzymes EcoRI (E), SalI (S), and HindIII (H). Genomic clones have been ordered from the 5' end of the cDNA (left) to the 3' end according to their
overlapping restriction maps. Vertical lines join common restriction sites in genomic clones that show small differences in their restriction maps.
In all these cases, the overlapping of the clones has been confirmed by hybridization, nucleotide sequencing, or both. The fragment from clone
gArATCa1 used in nuclease S1 protection experiments is indicated by a horizontal open box under its restriction map. mRNA coding regions were
determined by hybridization with oligonucleotides complementary to the cDNA and sequencing and are indicated by vertical open boxes on the
restriction map of the genomic clones from which they were sequenced. The number of each exon and its relative position and size are summarized
in the linear diagram underneath the restriction map of the genomic clones. Exons 1 and 17 have two different possible sizes, depending on the
origin of transcription utilized and the processing of the 3' end of the mRNA, respectively.

The nucleotide sequence of all the exons was almost identical
to that of the previously characterized cDNA clones and is not
shown. There are a few differences from the cDNA clones,
including 5 silent base changes and one conservative change
(nucleotides 167 and 168 of the cDNA, which change a Lys to an Arg).
The position of each intron in relation to the amino acid
sequence deduced from the cDNA, the nucleotide sequence of the
exon-flanking regions, and the size of each intron are shown in
Fig. 2. The conserved GT and AG dinucleotides present in the
donor and acceptor splicing signals, respectively, are shown in
boldface type.
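
The GT/AG rule mentioned above lends itself to a mechanical check. The following is a minimal Python sketch, assuming the intron border sequences of introns 11, 12, and 17 as they can be read from Fig. 2; the names `INTRON_BORDERS`, `follows_gt_ag_rule`, and `check_all` are illustrative and were not part of the original analysis.

```python
# Intron borders read from Fig. 2: (first bases of the intron, last bases).
# Only three introns are included; this subset is an illustrative assumption.
INTRON_BORDERS = {
    11: ("gtaagttatc", "tggtatttttatag"),
    12: ("gtaagtcagt", "tggatctatattag"),
    17: ("gtatgccgct", "ccttcttttcacag"),
}


def follows_gt_ag_rule(donor: str, acceptor: str) -> bool:
    """Return True if the intron starts with GT and ends with AG."""
    return donor.lower().startswith("gt") and acceptor.lower().endswith("ag")


def check_all(borders: dict) -> dict:
    """Map each intron number to whether it obeys the GT/AG rule."""
    return {n: follows_gt_ag_rule(d, a) for n, (d, a) in borders.items()}


if __name__ == "__main__":
    for intron, ok in sorted(check_all(INTRON_BORDERS).items()):
        print(f"intron {intron}: {'GT...AG' if ok else 'non-canonical'}")
```

The same function could be applied to the remaining rows of Fig. 2 once their border sequences are entered in the dictionary.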

In Fig. 3, we have compared the intron/exon structure of the
A. franciscana SERCA gene with that of the other SERCA gene
whose structure is known, the rabbit SERCA1 gene (19), and
with the characterized exons of rabbit, human, and pig
SERCA2 genes (6, 7, 20). In this figure, closed boxes represent
exons containing untranslated regions of the mRNA and open
boxes exons containing translated regions. Successive exons are
alternately placed above or under the middle line of the
diagrams to facilitate the localization of the introns. Alternative
processing of the 3' end exons of these three genes is shown.

Twelve of the 21 introns present in rabbit SERCA1 are also
present in the same position of the A. franciscana gene (see Fig.
3). Eight introns that are in rabbit SERCA1 are not in A.
franciscana SERCA (indicated by thin arrows in Fig. 3), while 4
of the A. franciscana introns are not present in rabbit SERCA1
(indicated by arrowheads in Fig. 3). There is one intron whose
position is displaced by one base pair between rabbit SERCA1
and A. franciscana SERCA genes (indicated by an asterisk in
Fig. 3). The position of the last, optional, intron is identical in
A. franciscana to the position of the optional intron 21 of rabbit
SERCA1 but is displaced 3 bp with respect to the equivalent
optional intron of rabbit SERCA2 gene (indicated by double
asterisks in Fig. 3).
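
The intron counts given in this comparison can be cross-checked with simple arithmetic. The sketch below is only an illustration, assuming the four categories stated in the text (12 shared, 1 shifted by one base, 8 rabbit-only, 4 A. franciscana-only); the class name `IntronComparison` and its methods are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntronComparison:
    shared: int        # introns at identical positions in both genes
    shifted: int       # introns displaced by one base between the genes
    rabbit_only: int   # introns found only in rabbit SERCA1
    artemia_only: int  # introns found only in A. franciscana SERCA

    def rabbit_total(self) -> int:
        return self.shared + self.shifted + self.rabbit_only

    def artemia_total(self) -> int:
        return self.shared + self.shifted + self.artemia_only

    def conserved_fraction_of_rabbit(self) -> float:
        """Fraction of rabbit introns found at exactly the same position."""
        return self.shared / self.rabbit_total()


counts = IntronComparison(shared=12, shifted=1, rabbit_only=8, artemia_only=4)
print(counts.rabbit_total())    # 21 introns in rabbit SERCA1
print(counts.artemia_total())   # 17 introns in A. franciscana SERCA
print(round(counts.conserved_fraction_of_rabbit(), 2))  # 0.57
```

The totals agree with the 21 rabbit SERCA1 introns and the 17 A. franciscana introns reported in the text.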


Analyses of the 5' End of the Gene —The nucleotide sequence 
of the region preceding the first exon of the gene was determined
and is shown in Fig. 4. The positions of the initiator
methionine and the first intron are also indicated in Fig. 4. The
cDNA clone pArATCa412 extends to the EcoRI site located at
the 3' end of the oligonucleotide OliCa4 (underlined in Fig. 4).

The site of transcription initiation was determined by primer
extension and nuclease S1 protection experiments. Primer
extension was carried out using total RNA from 26-h-old animals
and two oligonucleotides complementary to SERCA mRNA,
OliCa1 and OliCa4, whose complementary sequences are
underlined in Fig. 4. OliCa1 is located in the 5'-untranslated
region, adjacent to the initiator methionine, in the second exon
of the gene, while OliCa4 is located in the first exon of the gene.
The results obtained with these two oligonucleotides are shown
in Fig. 5, A and B, respectively. The origins of transcription
calculated from the fragments extended from both
oligonucleotides are identical, with the major bands corresponding to
initiation at nucleotides -135 and -136 from the initiator
methionine (shown with asterisks in Fig. 4). There is a second
group of fainter bands also observed with OliCa1 and OliCa4
that could correspond to initiation at nucleotides -165 and
-167 (also shown with asterisks in Fig. 4). Other bands
observed in the primer extension with OliCa1 are not observed
with OliCa4. These experiments suggest the existence of a
main site of transcription initiation at nucleotides -135 and
-136 and a possible minor one at nucleotides -165 and -167.

Nuclease S1 protection experiments were done using the
SalI-EcoRI most 5' fragment from phage gArATCa1, where the
SalI site is located in the EMBL-3 polylinker (see horizontal

[Rows 1-7: sequences not recoverable; flanking residue positions
73/74, 111/112, 157/158, 235/236, and 367/368.]

 8. TTT AAT GAG    /gtgagtcatg-(1.7)-gtttctatttacag/   GTT AAA CAA
     F   N   E                                          V   K   Q
            433                                        434

 9. TGG AAA AAG    /gtatattttt-(6.5)-tgttttccttttag/   GAA TTT ACT
     W   K   K                                          E   F   T
            485                                        486

10. AAA TAT GAG    /gtcaatattat-(0.9)-ctttttatttctag/  CAA AAC TCT
     K   Y   E                                          Q   N   C
            592                                        593

11. AGG CGA ATT G  /gtaagttatc-(1.7)-tggtatttttatag/   .CT GTC TTT
     R   R   I                                          G   V   F
            643                                        644

12. TCC GCT ATG    /gtaagtcagt-(2.9)-tggatctatattag/   ACT GGT GAC
     S   A   M                                          T   G   D
            704                                        705

[Row 13: sequence not recoverable; flanking residue positions 778/779.]

14. TAT CAG TTG    /gtgagtatac-(5.3)-tgttttttttatag/   AGC CAT CAC
     Y   Q   L                                          S   H   H
            874                                        875

15. GCC ATT AAC AG /gtaatgcttt-(3.8)-acattcttttgtag/   .C TTG TCT
     A   I   N                                          S   L   S
            918                                        919

16. ATA CTG TCT    /gtaagtagtt-(1.8)-ctaatcttttccag/   ACT GTC TTC
     I   L   S                                          T   V   F
            958                                        959

17. TAT ACC GAT G  /gtatgccgct-(4.8)-ccttcttttcacag/   .AA TTT TCT
     Y   T   D                                          E   F   S
            997                                        998

Fig. 2. Nucleotide sequence of the exon/intron limits of A. franciscana
SERCA gene. The nucleotide sequence of the exon/intron
borders has been determined for all the introns. Exon sequences are
shown in capital letters and intron sequences in lower case. The encoded
protein sequence is shown underneath the exon sequences. Exon/intron
limits are indicated by diagonal lines. The conserved GT and AG
dinucleotides present in the intron borders are shown in boldface type.
The estimated size of each intron is indicated in kilobases.


open box in Fig. 1). This EcoRI site is located at the 3' end of the
OliCa4 sequence (underlined in Fig. 4). The fragment was labeled
at the EcoRI site, hybridized to total RNA from 26-h-old
embryos, and digested with increasing amounts of nuclease S1
as indicated in the legend to Fig. 6. Nuclease S1 protection
experiments gave two main groups of protected fragments (Fig.
6). The major group corresponds to fragments protected up to
nucleotides -135 to -137 and the minor group to fragments
protected up to nucleotides -164 to -168 (indicated by asterisks
in Fig. 4). These results are in agreement with the data obtained by
primer extension and indicate the existence of two sites of
transcription initiation, one at nucleotides -135 to -137 and
the other at nucleotides -165 to -167.

The results obtained do not allow the assignment of the
transcription initiation site to a single nucleotide in each
region, suggesting some heterogeneity in the transcription
initiator nucleotide at both initiation sites. Analysis of the
sequence of the two initiation sites shows a hexanucleotide
(ACTTAT) that is present at both sites (shown in boldface type
in Fig. 4). The bases of the trinucleotide TTA of this
hexanucleotide seem to be the preferential places of initiation
of transcription at both sites. The strongest, most 3'
initiation site (nucleotides -135 to -137) is preceded, 25-30 bp
upstream, by a potential TATA box close to the canonical
sequence (TATAAG, nucleotides -167 to -162, underlined in
Fig. 4). The weaker, most 5' initiation site (nucleotides -165
to -167) is preceded, 25-30 bases upstream, by a potential TATA
box less related to the canonical sequence (TATTTA, nucleotides
-201 to -196, underlined in Fig. 4). There is thus a good
correlation between the presence of a more canonical TATA box
and the higher frequency of use of the initiation site.
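
The repeated hexanucleotide and the TATA-like element can be located directly on the promoter sequence. The following Python sketch is an illustration only: it assumes the 50-nucleotide row of Fig. 4 that begins at position -182, transcribed here without spaces, and the helper name `find_motif` is hypothetical. The coordinates it prints depend on that transcription being exact.

```python
# Row of Fig. 4 starting at position -182 relative to the initiator ATG.
ROW_START = -182
ROW = "AACGAGTGGTATACTTATAAGGTTTCCATTTGGCTCTTGAAGAACTTATT"


def find_motif(seq: str, motif: str, first_pos: int) -> list:
    """Return (start, end) coordinates of every match, overlaps included."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append((first_pos + i, first_pos + i + len(motif) - 1))
        i = seq.find(motif, i + 1)
    return hits


hexamers = find_motif(ROW, "ACTTAT", ROW_START)
tata = find_motif(ROW, "TATAAG", ROW_START)
print(hexamers)  # [(-170, -165), (-139, -134)]
print(tata)      # [(-167, -162)]
```

In this row the hexanucleotide occurs twice, once in each initiation region, and the TATAAG element falls at -167 to -162, as stated in the text.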

Several putative protein binding sites have been localized on
the sequence that is located upstream of the initiation sites
using the computer program MacPattern. These putative
regulatory regions are underlined in Fig. 4. The putative regulatory
region that is repeated most often in this sequence (eight
times) is the sequence CANNTG, corresponding to the consensus
binding site of muscle-specific transcription factors of the
MyoD family (21). The presence of these binding sites could be
functionally important, since SERCA genes are transcribed at
high levels in muscle. Other binding sites present in this
promoter region that have also been shown to be important for the
expression of some muscular genes are the CArG box (22)
present at positions -410 to -400 and three putative MEF-2 binding
sites (23), underlined in Fig. 4. A 45-nucleotide long poly(A)
stretch is present at positions -467 to -423 that could be
analogous to the poly(A/T) stretches described in the promoter of
the rabbit SERCA2 gene (20). There are also four repetitions of
the binding site for the general transcription factor TCF-1 and
single binding sites for other non-tissue-specific factors such as
H4TF-1, AP-2, GRE, CBF, and E2A.
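
A consensus site with ambiguous positions, such as CANNTG, can be searched for with a regular expression. The sketch below is not the MacPattern procedure used in the study; it is a hedged Python illustration that scans only the two rows of Fig. 4 starting at -282 and -232 (transcribed here without spaces), and the names `E_BOX` and `find_e_boxes` are hypothetical. It therefore does not reproduce the count of eight sites reported for the whole promoter.

```python
import re

# Two consecutive 50-nt rows of Fig. 4 (positions -282 to -183).
ROW_START = -282
ROWS = (
    "CATCAATTTTTCTAATTCATTGCACCTGTCGAATACTTTCAGAAACTCCA"
    "CAGGTGCTTGATTCTGCGCATTTCTCTCTTGTATTTAGAAATAGTAAGAA"
)

# Lookahead so that overlapping CANNTG matches would also be reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_e_boxes(seq: str, first_pos: int) -> list:
    """Return (start coordinate, matched hexamer) for every CANNTG site."""
    return [(first_pos + m.start(), m.group(1)) for m in E_BOX.finditer(seq)]


print(find_e_boxes(ROWS, ROW_START))
# [(-260, 'CACCTG'), (-232, 'CAGGTG')]
```

Both hits correspond to sites labeled CANNTG under these rows in Fig. 4.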

Southern Blot Analyses of the Number of SERCA Genes — 
Since vertebrates have three SERCA genes and Drosophila has 
only one, it was of interest to determine the number of A. 
franciscana SERCA genes to better understand the evolution 
and regulation of this gene family in arthropods. All the
genomic clones that we have analyzed overlap
and seem to code for only one gene. Similar experiments led to
the isolation of both SERCA1 and SERCA2 genomic clones in
vertebrates (20), so we think that the results obtained
suggest the existence of a unique SERCA gene in A. franciscana.
To further test this hypothesis, Southern blot analyses
were done under high (0.1 x SSC, 65 °C) and low (2 x SSC,
65 °C) stringency conditions. A cDNA fragment spanning
three exons of the gene that codes for a region highly conserved
in all known SERCA genes (amino acids 705-821, 90% homology
between A. franciscana and rabbit proteins) was used as
probe. The results obtained using three different restriction
enzymes to digest the DNA are shown in Fig. 7. The same
restriction fragments hybridized to the probe under high and
low stringency conditions and were the ones predicted from the
restriction map of the genomic clones (Fig. 7).

Similar experiments have also been made using as probe a
cDNA fragment that codes for exon 7 (88% homology between
A. franciscana and rabbit proteins) with identical results; only
the restriction fragments corresponding to the isolated genomic
clones hybridize either under high or low stringency conditions
(data not shown). A decrease in the temperature of hybridization
(32 °C) and washing (45 °C) of the filters did not reveal any
additional hybridization (data not shown). These results
indicate that if a second SERCA gene exists in A. franciscana, it
should have diverged from the characterized gene more than
vertebrate genes have and strongly support the hypothesis that
a single SERCA gene exists in A. franciscana.

Fig. 3. Comparative study of the intron/exon boundaries of A. franciscana and mammalian SERCA genes. The intron/exon
boundaries of rabbit SERCA1, the known boundaries of human, rabbit, and pig SERCA2, and those of A. franciscana SERCA genes are compared. Protein
coding exons are represented by open boxes and exons coding for untranslated sequences as black boxes. The length of each box is proportional to
the exon size. Consecutive exons are alternately placed at one or the other side of the middle line of the diagrams to show up intron/exon
boundaries. A. franciscana SERCA gene exons are numbered. Each of these three genes codes for two protein isoforms, depending on the exons that
are included in the C-terminal region of the protein. The exons used for the two isoforms are represented as SERCA1a and -1b, SERCA2a and -2b,
and Artemia 4.5- and 5.2-kb mRNAs. Intron positions that are not labeled are identical in A. franciscana and mammals. Thin arrows represent
the position of introns that exist in rabbit SERCA1 and not in the A. franciscana gene. Thick arrowheads indicate the position of introns present
in A. franciscana and not in rabbit SERCA1. The asterisk indicates a difference of 1 bp between the position of this intron in rabbit SERCA1 and
A. franciscana. The double asterisk indicates a difference of 3 bp in the intron position between A. franciscana and rabbit SERCA2, although the
position of this intron is the same in A. franciscana and rabbit SERCA1 genes.

DISCUSSION 

The isolation and characterization of 16 overlapping genomic
clones coding for the A. franciscana SERCA is presented in this
article. Several lines of evidence suggest that this is the only SERCA gene
encoded in A. franciscana. Northern blot experiments show
that SERCA cDNA clones hybridize to two mRNAs, both of
which have been shown to be generated by different processing
of the primary transcript of this gene (14). Additionally, the 16
characterized genomic clones overlap and code
for a single gene. Only one nonoverlapping clone could be
identified, and it coded for a pseudogene. 2 Finally, Southern blot
experiments showed that two cDNA fragments highly conserved
between SERCA genes hybridize exclusively with the restriction
fragments corresponding to the characterized gene, even
under low stringency hybridization conditions.

The structure of A. franciscana SERCA gene has been analyzed.
It is divided into 18 exons and is 65 kb long. Only one other
SERCA gene has been completely characterized, the rabbit
SERCA1 (19). A partial characterization of rabbit (20), human
(7), and pig (6) SERCA2 genes has also been established. In
comparison, rabbit SERCA1 gene is shorter than A. franciscana
gene (16.5 versus 65 kb) but is divided into a larger
number of exons (23 versus 18). Rabbit SERCA2 gene is over 45
kb long, and the known partial intron/exon organization of the
pig, human, and rabbit genes is identical to that of rabbit
SERCA1 except for the two C-terminal exons.

Twelve of the 21 introns present in rabbit SERCA1 also exist, 
in the same position, in A. franciscana SERCA gene. One more 
intron is displaced just one base between both genes. Eight of 
the rabbit SERCA1 introns are absent from A. franciscana, 
and, on the contrary, four introns of the A. franciscana gene, 


2 R. Escalante and L. Sastre, unpublished results. 


including the intron present in the 5'-untranslated sequence,
are not present in rabbit SERCA1. These data show an important
conservation of intron position during the evolution of this
gene (over 50% of the intron positions are conserved). There are
two alternative interpretations of intron evolution, and the
data obtained about the SERCA genes are compatible with both
of them. One interpretation is that all the introns were present
in the ancestral genes and are being lost during evolution (24).
According to this hypothesis, the ancestral SERCA gene,
precursor of the three vertebrate genes and the unique A. franciscana
gene, should have had at least 26 introns; eight were
lost in the evolution of A. franciscana and four in the evolution
of the vertebrate SERCA1 gene. The alternative interpretation
is that introns are added to the genes during evolution (25). If
this hypothesis is correct, the ancestral SERCA gene would
have had only the 12 introns that are common to rabbit and A.
franciscana genes. Eight more introns would have been added
to the gene during divergent vertebrate evolution and four to
the A. franciscana gene. We have not considered the intron
whose position is changed one base pair, since some authors
consider that both would be the same intron (26) and some
others consider them as different introns (27).

The first hypothesis on intron evolution would also predict 
that introns have a functional meaning, since structural or 
functional protein domains would be encoded in single exons in the 
ancestral gene (28). This hypothesis does not correlate well 
with the intron/exon structure of the rabbit SERCA1 gene, because 
some functional domains that are conserved in all P-type 
ATPases and some transmembrane domains are interrupted by 
introns (19). Some of these domains are not interrupted by 
introns in the A. franciscana gene, namely the β-sheet domain 
(intron 6 of SERCA1), the transduction domain (intron 8 of 
SERCA1), the fluorescein isothiocyanate binding site (intron 13 
of SERCA1), and the transmembrane domain M7 (intron 17 of 
SERCA1). The other introns that interrupt structural or 
functional domains, such as the transmembrane domains M1 and 
M5, are present both in rabbit SERCA1 (introns 3 and 16) and 
in A. franciscana. The four introns exclusive to A. franciscana 
do not interrupt any known structural or functional domain of 
the protein. Taken together, we think that the data on SERCA 
intron evolution favor the hypothesis that at least some of the 
introns have been added during evolution, since structural and 
-932 AACGCACTTT GGCACCCACA CATACAC GAT TTCCCATGCC TTCAGACCTC 

H4TF-1 AP-2 

-882 ATAAATGTTA CAAATAAAAT TTTCGTTTTT ATCCTTT TGT TCT CTTTTGG 

GRE 

-832 TTTATATTTT TTATAACATT CAAGAAATTG TTGTCGGCTC TTCACCTAAA 

-782 AACACATCAC TCTTCGATCT CCACTAAACC TGATCCTAAA CATATATAAT 

-732 TGTTTTTCAA ATAATTTT CA AAGG CGTATA TAACAACGAT ATACCTTTCT 
TCF-1 

-682 GATGAATAGA ATAAATAGAA TAT TAATTTT A TAAGATTTG TTTTTTGTTT 
MEF-2 

-632 TTTTAAGCAT CCCCTTTGAA CTTT CAAAG C CACCGCA CAA ATG TCTATGA 
TCF-1 CANNTG 

-582 CATCGTTTTC TATCCTCTTT CCC CAAATG T TAATTGGTGC CCATAGTTGT 
CANNTG 

-532 A TAAAATTA C AGG CAATTG T GTTTTATAGC CCAATTG TTT AGCCTGTTAG 
MEF-2 CANNTG CANNTG CCAAT 

-482 CATTATAACT TGCTT AAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

-432 AAAAAAAAAA GGGGTCTTAC ATC CCATTAA AGGG GACGCT CCTCATTACT 
TCF-1 CArG 

-382 CGT TATTTTT A TT CAAGTG T CGTTGCTATG CTTCTTTACA ACTATTTCTC 
MEF-2 CANNTG 

-332 GCAATAATAT ATGACCAGAA TTGTGTTTTG ATAGTTTGT C AAATGA TCCC 

CANNTG 

-282 CATCAATTTT TCTAATTCAT TG CACCTG TC GAATACTTTC AGAAACTCCA 
CANNTG E2A TCF-1 

-232 CAGGTGC TTG ATTCTGCGCA TTTCTCTCTT G TATTTAG AA ATAGTAAGAA 
CANNTG TATA 

-168 -164 -137 -135 

-182 AACGAGTGGT ATAC TTATAA GGTTTCCATT TGGCTCTTGA AGAACTTATT 
TATA 

-132 CGTGCCGAGT TCGTCAGGGG GAAAAGGCCG AAATA CCGGC AGTGCCTA CT 

OliCa4 

-82 GGAATTC GAC GGATCGAAGT TACTGGATAT CTTCTTG/AGCAGGACTATTA 

Intron 

-32 AATAATAGTC C TTCATCAAG CAGTCATCCA AG ATG 
OliCa1 -1 Met 

Fig. 4. Nucleotide sequence of the 5' end of the A. franciscana 
SERCA gene. The nucleotide sequence of the 5'-untranslated region of 
the SERCA mRNA and the promoter region of the gene is shown. 
Nucleotide 1 is the A of the initiation codon ATG. The limit between 
the first and second exons is indicated by a diagonal line. The 
oligonucleotides used in primer extension experiments (OliCa1 and 
OliCa4) are underlined. The repeated hexanucleotide, where 
transcription initiation sites are located, is indicated in boldface 
type. The main initiation sites are indicated by asterisks and their 
positions shown (-135 to -137 and -164 to -168). Putative regulatory 
sequences are underlined, and the consensus sequence or the 
transcription factor likely to bind to the region is indicated 
underneath the sequence. Putative TATA boxes are also underlined. 

functional domains that are encoded by single exons in A. 
franciscana are interrupted by introns in rabbit. It seems more 
likely that these domains would have been encoded by single 
exons in the original gene and that the introns that interrupt 
them would have been added during vertebrate evolution. 

The origin of transcription of the A. franciscana SERCA gene 
has been determined by primer extension and nuclease S1 
protection experiments. Both techniques gave similar results, 
indicating the existence of two origins of transcription separated 
by 30 bp. Transcription initiation is heterogeneous within each 
of these two origins, with two or three consecutive bases being 
used as initiation sites. These results were obtained by 
primer extension using two different oligonucleotides and by 
nuclease S1 protection experiments. The existence of more 
than one initiation site has been described for other genes and, 


Fig. 5. Primer extension analyses of the A. franciscana SERCA 
transcription initiation sites. Seventy-five micrograms of total RNA 
from 26-h-old nauplii were hybridized to radioactively labeled 
oligonucleotide OliCa1 (panel A) or OliCa4 (panel B) and incubated 
with 50 units of avian myeloblastosis virus reverse transcriptase. 
The location of these oligonucleotides is indicated in Fig. 4. The 
synthesized products were analyzed on a 6% polyacrylamide-7 M urea 
sequencing gel (lanes OliCa1 and OliCa4, respectively). Lanes M show 
the migration of sequencing reactions used as markers to calculate 
the size of the extended products. The positions of the main 
initiation sites, calculated from the A of the initiation codon, are 
indicated in the right margin of the figure. 

in general, has been related to the absence of a TATA box in the 
gene promoter. In the case of the A. franciscana SERCA gene, 
where a TATA box can be identified in the promoter, the 
existence of two initiation regions could be explained by the similar 
nucleotide sequence of the two regions and the presence of 
putative TATA boxes 25-30 bases upstream of each of them. 

As mentioned before, the A. franciscana SERCA gene is 
transcribed into two different mRNAs that originate from 
alternative processing events and that might have different tissue 
specificity of expression. It is possible that one of the 
origins of transcription is used to transcribe one of the mRNAs 
and the other origin the second mRNA. To try to answer this 
question, primer extension experiments were performed with RNA 
from several developmental stages (6, 8, 20, and 26 h) at which 
the relative level of expression of the two mRNAs is quite 
different (12). At 6 or 8 h of development, both mRNAs are 
expressed at similar levels, while at 20 and 26 h, the smaller 
mRNA (4.5 kb) is expressed at levels 5-10 times higher than 
Fig. 6. Nuclease S1 protection analyses of A. franciscana 
SERCA gene transcription initiation site. The most 5' SalI-EcoRI 
fragment of the genomic clone gArATCa1, where the SalI site is located 
in the polylinker of the EMBL-3 vector (see Fig. 1), was labeled at its 
EcoRI site. This EcoRI site is also shown at the 3' end of the OliCa4 
sequence in Fig. 4. The isolated labeled fragment was hybridized to 100 
µg of total 26-h-old nauplii RNA and digested with increasing amounts 
of nuclease S1 as follows: lane 1, 5 units; lane 2, 10 units; lane 3, 25 
units; lane 4, 50 units; lane 5, 100 units; and lane 6, 300 units. In lane 
7, the probe was hybridized with 100 µg of yeast RNA and digested with 
100 units of nuclease S1. In lanes 8 and 9, the probe was incubated with 
100 µg of yeast and nauplii RNA, respectively, but the incubation was 
made in the absence of nuclease S1. Lane M shows the Maxam and 
Gilbert A + G degradation reaction of the probe that was used to identify 
the nucleotide sequence of the protected bands. The upper panel shows 
the region of migration of the probe (2 kb), and the lower panel shows 
that of the nuclease S1-protected fragments. The position of the main 
transcription initiation sites, calculated considering as base 1 the A of 
the initiation codon, is shown at the right side of the figure. 

the larger mRNA (5.2 kb). The relative intensities obtained for 
the two initiator regions were the same at all the developmental 
stages (data not shown), suggesting that the use of one or 
the other initiator region is not related to the expression of the 
4.5- or the 5.2-kb mRNAs. 

The analysis of the nucleotide sequence of the promoter 
region has shown the existence of several putative regulatory 
sequences. There are two putative TATA boxes, one preceding each 
of the transcription initiation regions. The possible TATA box 
whose sequence is closer to the consensus (TATAAA) precedes 
the initiation region closer to the coding region of the gene, which 
is the one used more frequently according to the primer 
extension and nuclease S1 protection experiments. There are also 
other putative general regulatory regions, including a CCAAT 
box and TCF-1, E2A, AP-2, and H4TF-1 binding sites. The 
sequence analysis also shows the existence of putative 
muscle-specific regulatory regions; in particular, there are eight 
CANNTG motifs that are possible binding sites of muscle 
transcription factors of the MyoD family (21). There is a CArG 
element as well that, in addition to being the binding site for the 
Fig. 7. Southern blot analyses of A. franciscana SERCA gene 
copy number. Fifteen micrograms of A. franciscana DNA were 
digested with EcoRI (E), SalI (S), or HindIII (H), analyzed on a 0.8% 
agarose gel, transferred to nylon membranes, and hybridized to an 
EcoRI-NdeI cDNA fragment that includes exons 12-14 of the gene. 
Duplicate filters were hybridized and washed either under high 
stringency conditions (panel A) or low stringency conditions (panel B). 
The size of the DNA markers (in kb) is indicated. Panel C shows the 
restriction map of the genomic clones that code for exons 12-15 and the 
position of the probe used in this experiment on the restriction map of 
the cDNA clones. 

serum response factor, has been found in several 
muscle-specific gene promoters (22). Three putative binding sites for 
MEF-2, another muscle-specific transcription factor (23), have also 
been found in this promoter. Some of these muscle-specific 
regulatory regions are also present in vertebrate SERCA1 and 
SERCA2 promoters, like the CANNTG motif that is found four 
times in SERCA1 (19) and five times in SERCA2 (20). A CArG 
box is also found in SERCA2. A long poly(A) stretch found in the A. 
franciscana SERCA promoter could be analogous to several 
poly(A/T) stretches that are present in the rabbit SERCA2 
promoter. 
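
The kind of motif search described above can be approximated with regular expressions. The sketch below is only an illustration and is not the software used for this analysis: the motif table, the function names, and the test sequence are hypothetical, the CArG consensus is taken as CC(A/T)6GG, and only one strand is scanned. Positions are reported relative to a user-supplied coordinate for the first base, which assumes that the scanned segment lies entirely upstream of the initiation codon.

```python
import re

# Degenerate symbols used in the motif definitions below.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "[ACGT]", "W": "[AT]"}

# Illustrative motif table; consensus strings are assumptions of
# this sketch, not a complete list of the sites in the promoter.
MOTIFS = {
    "E-box (CANNTG)": "CANNTG",
    "CArG box": "CCWWWWWWGG",
    "CCAAT box": "CCAAT",
    "TATA box": "TATAAA",
}


def motif_to_regex(motif):
    """Compile a degenerate motif; the lookahead allows overlaps."""
    pattern = "".join(IUPAC[base] for base in motif)
    return re.compile("(?=(" + pattern + "))")


def scan(seq, first_pos):
    """Return (motif name, position, matched text) for every hit.

    seq: DNA sequence; blanks (as in a printed figure) are removed.
    first_pos: coordinate assigned to the first base of seq.
    """
    seq = seq.upper().replace(" ", "")
    hits = []
    for name, motif in MOTIFS.items():
        for match in motif_to_regex(motif).finditer(seq):
            hits.append((name, first_pos + match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Synthetic test sequence (not part of the SERCA promoter).
for hit in scan("GGCAAATG TTCCAATTATAAAGG", first_pos=-100):
    print(hit)
```

A match of this kind only flags a candidate site; as noted in the text, such sequence motifs are putative until tested functionally.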

As mentioned before, the available data support the 
hypothesis that the SERCA gene described in this article is the only 
A. franciscana SERCA gene. In that case, the encoded protein 
must be expressed in the sarcoplasmic reticulum of muscle cells 
and in the endoplasmic reticulum of non-muscle cells. The 
presence of putative general and muscle-specific regulatory regions 
in the promoter would be in agreement with this hypothesis. 
The main muscle-specific regulatory region found is the 
consensus binding site for transcription factors of the MyoD family, 
suggesting that these factors can play an important role in the 
regulation of the expression of muscle genes in A. franciscana. 
No MyoD homolog has been described in A. franciscana, but the 
existence of a homologous gene has been described in another 
invertebrate, Drosophila (gene nautilus (29)), showing that this 
gene family is not restricted to vertebrates and most probably 
is also encoded in A. franciscana. The existence of CANNTG 
motifs in the promoter regions of vertebrate SERCA1 and 
SERCA2 genes suggests that their expression could be 
regulated by transcription factors of the MyoD family and that 
these regulatory mechanisms would have been conserved for 
SERCA genes during evolution. Sukovich et al. (30) have 
recently described a 17-bp upstream promoter element of the 
rabbit SERCA2 gene that is important for its transcription in 
skeletal muscle cells; we have not found this element in the A. 
franciscana SERCA promoter, indicating that although some common 
elements might have been conserved during evolution, the 
regulation of the expression of these genes is not identical in both 
species. In any case, the existence of conserved sequence motifs 
is only indicative, and functional analyses of the SERCA 
promoters would be necessary to further understand the evolution 
of the regulatory mechanisms of the expression of these genes. 